Enable QNN EP weight sharing generation using public API #23702

Open
wants to merge 16 commits into main
Conversation

@HectorSVC (Contributor) commented Feb 14, 2025

Description

Enable QNN EP weight sharing generation using the public API instead of internal interfaces, so that users can integrate it into their own toolchains. The change shares the QnnBackendManager across ORT sessions when ep.share_ep_contexts is enabled, and adds an extra option to end the sharing so that we know when to remove the shared QnnBackendManager from the singleton.

Rename the tool from onnxruntime_qnn_ctx_gen to ep_weight_sharing_ctx_gen, so that it can be reused by other EPs.
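
For reference, a minimal sketch (not the tool's actual code) of how a user might drive this flow through the public C++ API: each model gets its own session with ep.share_ep_contexts enabled, and the last session additionally sets the stop option. The ep.stop_share_ep_contexts key name, model names, and backend path below are assumptions for illustration.

```cpp
// Minimal sketch, assuming the public C++ API; the "ep.stop_share_ep_contexts"
// key, model names, and backend path are illustrative, not taken from this PR.
#include <string>
#include <unordered_map>
#include <onnxruntime_cxx_api.h>

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "ep_weight_sharing");
  std::unordered_map<std::string, std::string> qnn_options{{"backend_path", "QnnHtp.dll"}};

  auto make_options = [&](bool last_session) {
    Ort::SessionOptions so;
    so.AddConfigEntry("ep.context_enable", "1");      // dump the EPContext model
    so.AddConfigEntry("ep.context_embed_mode", "0");  // keep the context binary outside the Onnx file
    so.AddConfigEntry("ep.share_ep_contexts", "1");   // share the QnnBackendManager across sessions
    if (last_session) {
      // Assumed key name: signals the last sharing session so the shared
      // QnnBackendManager can be removed from the singleton.
      so.AddConfigEntry("ep.stop_share_ep_contexts", "1");
    }
    so.AppendExecutionProvider("QNN", qnn_options);
    return so;
  };

  // Creating each session generates the corresponding context-cache model.
  Ort::SessionOptions so1 = make_options(false);
  Ort::Session session1(env, ORT_TSTR("model1.onnx"), so1);

  Ort::SessionOptions so2 = make_options(true);
  Ort::Session session2(env, ORT_TSTR("model2.onnx"), so2);
  return 0;
}
```

This mirrors the flow that the ep_weight_sharing_ctx_gen tool automates from the command line.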

@HectorSVC added the ep:QNN (issues related to QNN execution provider) label Feb 14, 2025
@github-actions github-actions bot left a comment
You can commit the suggested changes from lintrunner.

#include "command_args_parser.h"
#include <google/protobuf/stubs/common.h>

#include "core/session/onnxruntime_session_options_config_keys.h"
#include "core/session/inference_session.h"
#include "core/session/ort_env.h"
Contributor

Hi, I believe this is still an internal header. And I also see that this tool still depends on the internal graph classes onnxruntime::Graph and onnxruntime::Node, which have a public header but a private/internal implementation. Since this still requires the tool to be compiled with internal ORT code, would this prevent users from integrating this into their own toolchains?

Contributor Author

Good point. Actually, for the post-processing part, users should be able to use the ONNX API to update the ONNX model; that's not the main part we want to cover in this tool. But anyway, let me make the changes to use the ONNX API to make it clear.
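
For illustration, a minimal sketch of that post-processing step using the ONNX protobuf API instead of the internal onnxruntime::Graph/Node classes (the file name and the printed output are assumptions; the EPContext/main_context names come from the helper shown later in this thread):

```cpp
// Hypothetical post-processing sketch using the ONNX protobuf API directly.
// The model file name "model1_ctx.onnx" is made up for this example.
#include <fstream>
#include <iostream>
#include "onnx/onnx_pb.h"

int main() {
  onnx::ModelProto model;
  std::ifstream in("model1_ctx.onnx", std::ios::binary);
  if (!model.ParseFromIstream(&in)) {
    std::cerr << "Failed to parse model\n";
    return 1;
  }
  // Find the EPContext node with main_context = 1 in the context-cache model.
  for (const auto& node : model.graph().node()) {
    if (node.op_type() != "EPContext") continue;
    for (const auto& attr : node.attribute()) {
      if (attr.name() == "main_context" && attr.i() == 1) {
        std::cout << "main context node: " << node.name() << "\n";
      }
    }
  }
  return 0;
}
```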

@github-actions github-actions bot left a comment

You can commit the suggested changes from lintrunner.

-v: Show verbose information.

-C: [session_config_entries]: Specify session configuration entries as key-value pairs: -C "<key1>|<val1> <key2>|<val2>"
Refer to onnxruntime_session_options_config_keys.h for valid keys and values.
[Example] -C "ep.context_enable|1 ep.context_embed_mode|0". These are set as default so can be ignored.

-i: [provider_options]: Specify QNN EP specific runtime options as key value pairs. Different runtime options available are:
Member

-i help string still mentions QNN EP

@@ -43,6 +43,35 @@ static const std::string& GetNodeAttr(const Node& node, const std::string& attr_
return default_val;
}

// from the context ache Onnx model, find the EPContext node with main_context=1,
Member

typo cache

endif()
target_link_libraries(onnxruntime_qnn_ctx_gen PRIVATE onnx_test_runner_common onnxruntime_test_utils onnxruntime_common onnxruntime_graph onnxruntime_session onnxruntime_providers onnxruntime_framework onnxruntime_util onnxruntime_mlas onnxruntime_optimizer onnxruntime_flatbuffers onnx_test_data_proto ${onnxruntime_test_providers_libs} ${onnxruntime_EXTERNAL_LIBRARIES} ${GETOPT_LIB_WIDE} ${SYS_PATH_LIB} ${CMAKE_DL_LIBS})
target_link_libraries(ep_weight_sharing_ctx_gen PRIVATE onnx_test_runner_common onnxruntime_test_utils onnxruntime_common onnxruntime_graph onnxruntime_session onnxruntime_providers onnxruntime_framework onnxruntime_util onnxruntime_mlas onnxruntime_optimizer onnxruntime_flatbuffers onnx_test_data_proto ${onnxruntime_test_providers_libs} ${onnxruntime_EXTERNAL_LIBRARIES} ${GETOPT_LIB_WIDE} ${SYS_PATH_LIB} ${CMAKE_DL_LIBS})
Contributor

Can some of these internal libraries be removed now that the tool uses public APIs?

#endif

if (test_config.model_file_paths.size() > 2) {
std::cerr << "QNN EP only support 2 models for the weight sharing feature.";
Contributor Author

There are some cases where more than 2 models share the weights.
